Creators/Authors contains: "Ramsdell, Jordan"

  1. Hauff, Claudia Curry (Ed.)
    Given a text with entity links, the task of entity aspect linking is to identify which aspect of an entity is referred to in the context. For example, if a text passage mentions the entity "USA", is it mentioned in the context of the 2008 financial crisis, American cuisine, or some other aspect? Complementing the efforts of Nanni et al. (2018), we provide a large-scale test collection derived from Wikipedia hyperlinks in a dump from 01/01/2020. Furthermore, we offer strong baselines with results and broken-out feature sets to stimulate more research in this area. Data, code, feature sets, runfiles, and results are released under a CC-SA license and offered on our aspect linking resource web page http://www.cs.unh.edu.unh.idm.oclc.org/~dietz/eal-dataset-2020/
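To make the entity aspect linking task in the abstract above concrete, here is a minimal sketch that ranks candidate aspects of an entity by bag-of-words cosine similarity with the mention's context. It is only an illustration of the task setup, not one of the baselines released with the collection; the function names, the aspect descriptions, and the example sentence are all hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def link_aspect(context, aspects):
    """Return the best-scoring aspect name and all scores (hypothetical helper)."""
    ctx = Counter(tokenize(context))
    scores = {name: cosine(ctx, Counter(tokenize(text))) for name, text in aspects.items()}
    return max(scores, key=scores.get), scores


# Hypothetical candidate aspects for the entity "USA" (illustrative text only).
aspects = {
    "2008 financial crisis": "banks mortgage lending collapse recession bailout financial markets",
    "American cuisine": "food dishes barbecue hamburgers regional cooking restaurants",
}
context = "The collapse of mortgage lending pushed USA banks toward a bailout."

best, scores = link_aspect(context, aspects)
print(best)
```

A lexical-overlap ranker like this is only a starting point; a learned ranker over richer features would presumably do better.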
  2. Abstract

    Motivation

    Whole metagenome shotgun sequencing is a powerful approach for assaying the functional potential of microbial communities. We currently lack tools that efficiently and accurately align DNA reads against protein references, the technique necessary for constructing a functional profile. Here, we present PALADIN, a novel modification of the Burrows-Wheeler Aligner that provides accurate alignment, robust reporting capabilities, and orders-of-magnitude improved efficiency by directly mapping in protein space.

    Results

    We compared the accuracy and efficiency of PALADIN against existing tools that employ nucleotide or protein alignment algorithms. Using simulated reads, PALADIN consistently outperformed the popular DNA read mappers BWA and NovoAlign in detected proteins, percentage of reads mapped, and ontological similarity. We also compared PALADIN against four existing protein alignment tools (BLASTX, RAPSearch2, DIAMOND, and Lambda) using empirically obtained reads. PALADIN yielded results seven times faster than the best-performing alternative, DIAMOND, and nearly 8000 times faster than BLASTX. PALADIN's accuracy was comparable to all tested solutions.

    Availability and Implementation

    PALADIN was implemented in C, and its source code and documentation are available at https://github.com/twestbrookunh/paladin

    Supplementary information

    Supplementary data are available at Bioinformatics online.

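The PALADIN abstract above describes aligning DNA reads against protein references in protein space. As a rough illustration of that idea, the sketch below translates a DNA read in all six reading frames using the standard genetic code and then checks each translation against a peptide. This is a generic textbook technique shown under stated assumptions; it is not PALADIN's implementation (which is written in C and built on the Burrows-Wheeler Aligner), and the helper names, the example read, and the substring "alignment" are hypothetical simplifications.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string over A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame(read):
    """Return the six reading-frame translations keyed as '+1'..'+3', '-1'..'-3'."""
    read = read.upper()
    frames = {}
    for strand, seq in (("+", read), ("-", reverse_complement(read))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Hypothetical read and reference peptide, for illustration only.
read = "ATGGCCAAGTAA"
frames = six_frame(read)

# Toy stand-in for alignment: exact substring match against a reference peptide.
hits = [name for name, peptide in frames.items() if "MAK" in peptide]
print(frames["+1"], hits)
```

A real aligner would replace the substring check with an indexed, mismatch-tolerant search over the protein reference.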